Abstract

The recent successes of max-plus and more general idempotent structures for attacking nonlinear
control problems offer the potential for revolutionary improvements in our ability to design and
implement nonlinear controls for real applications. This effort has focused on theoretical
foundations for the application of max-plus arithmetic in stochastic settings, as well as numerical
methods of approximation.

In addition, a number of related issues in stochastic control have been considered. One problem
of particular interest has been the stability of estimation in stochastic adaptive control, and
another has been max-plus approaches for sensing, especially distributed sensing.


DISTRIBUTION  STATEMENT:  Approved  for  public  release;  distribution  is  unlimited. 


The  views  and  conclusions  contained  in  this  document  are  those  of  the  authors  and  should  not  be 
interpreted  as  representing  the  official  policies,  either  express  or  implied,  of  the  Department  of 
the  Air  Force  or  the  U.S.  Government. 


Objectives 

In this effort, we have pursued a number of stochastic control problems. Our first objective has
been the application of max-plus methods in stochastic control. In particular, we have sought
understanding of distributive idempotent techniques that provide efficient computational
approaches in stochastic control problems.

In  a  second  line  of  inquiry,  we  have  also  studied  some  stochastic  adaptive  control  problems, 
specifically  problems  of  adaptive  disturbance  cancellation.  These  problems  arise  in  beam  control 
for high-energy laser applications. We have maintained close collaboration with AFRL and
Boeing Directed Energy Systems scientists and engineers in studying these applications.

Background 


The max-plus algebra involves a redefinition of arithmetic operations, for computational and
analytical benefit. The basic scalar set of interest is the real numbers, augmented by $-\infty$:

$$\mathbb{R}^{-} = \mathbb{R} \cup \{-\infty\}.$$

On this set, two operations, $\oplus$ and $\otimes$, are defined by

$$a \oplus b = \max\{a, b\}, \qquad a \otimes b = a + b.$$

It is well known that $\mathbb{R}^{-}$ forms a commutative semi-ring under these operations. The
additive identity is $-\infty$, while the multiplicative identity is 0. Except for the additive
identity, every element has a multiplicative inverse, suggesting that one might be able to extend
the structure to a field structure. However, addition in this semi-ring is idempotent, meaning that
$a \oplus a = a$. It is important to note that the only rings satisfying additive idempotency are
trivial; that is, the only element is the additive identity (in a ring, adding $-a$ to both sides of
$a + a = a$ gives $a = 0$). Thus, extending the semi-ring to a ring (and hence a field) is not
a possibility.

From  these  basic  operations,  standard  linear  algebraic  objects  can  be  built,  such  as  matrices  and 
vectors. 
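As a small illustration of these definitions, the following Python sketch implements the two scalar operations and a max-plus matrix-vector product. The function names (`oplus`, `otimes`, `mat_vec`) and the sample matrix are illustrative choices, not notation from this report.

```python
NEG_INF = float("-inf")  # additive identity of the max-plus semi-ring


def oplus(a, b):
    """Max-plus addition: a (+) b = max(a, b)."""
    return max(a, b)


def otimes(a, b):
    """Max-plus multiplication: a (x) b = a + b."""
    return a + b


def mat_vec(B, a):
    """Max-plus matrix-vector product: (B (x) a)_i = max_j (B[i][j] + a[j])."""
    return [max(otimes(bij, aj) for bij, aj in zip(row, a)) for row in B]


# Hypothetical 2x2 example; NEG_INF plays the role of a "zero" entry.
B = [[0.0, -1.0], [2.0, NEG_INF]]
a = [3.0, 5.0]
result = mat_vec(B, a)  # [max(0+3, -1+5), max(2+3, -inf+5)]
```

Here the entry `NEG_INF` contributes nothing to the maximum in its row, in the same way that a zero entry contributes nothing to an ordinary matrix-vector product.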


To illustrate the application of max-plus algebraic structure, we consider a standard nonlinear
control problem. We begin with a dynamical system under control, of the form

$$\dot{x} = f(x,u), \qquad x(t_0) = x_0,$$

with a control objective given by

$$J(u, x_0, t_0) = \int_{t_0}^{t_f} g(x(t), u(t))\,dt,$$

which is to be maximized over the set of admissible control functions, $u \in U(t_0, t_f) \subset L(t_0, t_f)$.
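For a given control function and initial state, we denote the solution of the differential equation
by $x(\cdot\,; t_0, x_0, u)$. The Bellman equation of dynamic programming, given by

$$V(y,t) = \max_{u \in U(t,s)} \left\{ \int_t^s g(x(\tau), u(\tau))\,d\tau + V(x(s; t, y, u), s) \right\},$$

leads to a family of operators $S_{s,t}$,

$$S_{s,t}(\phi)(y) = \max_{u \in U(t,s)} \left\{ \int_t^s g(x(\tau), u(\tau))\,d\tau + \phi(x(s; t, y, u)) \right\} = \bigoplus_{u} \left\{ G(u) \otimes L_u(\phi) \right\},$$

in which $G(u)$ is the integrated running cost and $L_u(\phi) = \phi(x(s; t, y, u))$. These operators
are max-plus linear evolution operators.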
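The max-plus linearity of the deterministic propagation can be checked numerically on a one-step, discrete-time analogue. The sketch below is only an illustration under stated assumptions: the dynamics $f(x,u) = u$, the running cost $g(x,u) = -x^2 - u^2/2$, the three-point control set, and the step size are all hypothetical choices, not taken from this report. It verifies that $S(\phi_1 \oplus (c \otimes \phi_2)) = S(\phi_1) \oplus (c \otimes S(\phi_2))$ at a few sample points.

```python
def make_operator(f, g, controls, h):
    """One-step deterministic DP operator: S(phi)(y) = max_u {h*g(y,u) + phi(y + h*f(y,u))}."""
    def S(phi):
        return lambda y: max(h * g(y, u) + phi(y + h * f(y, u)) for u in controls)
    return S


# Hypothetical problem data (assumptions for illustration only).
f = lambda x, u: u
g = lambda x, u: -x * x - 0.5 * u * u
S = make_operator(f, g, controls=[-1.0, 0.0, 1.0], h=0.25)

phi1 = lambda x: -abs(x - 1.0)
phi2 = lambda x: -2.0 * abs(x + 0.5)
c = 3.0

ys = [-2.0 + 0.5 * k for k in range(9)]
# Left side: S applied to the max-plus combination phi1 (+) (c (x) phi2).
lhs = [S(lambda x: max(phi1(x), c + phi2(x)))(y) for y in ys]
# Right side: the same max-plus combination of S(phi1) and S(phi2).
rhs = [max(S(phi1)(y), c + S(phi2)(y)) for y in ys]
max_gap = max(abs(l - r) for l, r in zip(lhs, rhs))
```

The two sides agree because the maximum over controls commutes with the maximum over the two terminal functions, and an additive constant passes through the operator unchanged.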

The dynamic programming propagation operator in the stochastic case,

$$S_{s,t}(\phi)(y) = \max_{u \in U(t,s)} E\left\{ \int_t^s g(X(\tau), u(\tau))\,d\tau + \phi(X(s; t, y, u)) \right\},$$

is not max-plus linear, because the expectation does not commute with the maximization. We may
nevertheless use the distributive property of multiplication over addition to apply max-plus
methods effectively.
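A two-point example makes the failure of max-plus linearity concrete. In the sketch below, the random variable, the two test functions, and the helper `expect` are hypothetical choices for illustration; the point is only that the expectation of a maximum can strictly exceed the maximum of the expectations.

```python
# X takes the values -1 and +1 with probability 1/2 each (assumed toy distribution).
outcomes = [(-1.0, 0.5), (1.0, 0.5)]


def expect(phi):
    """Expectation of phi(X) under the toy distribution above."""
    return sum(p * phi(x) for x, p in outcomes)


phi1 = lambda x: x
phi2 = lambda x: -x

e_of_max = expect(lambda x: max(phi1(x), phi2(x)))  # E[phi1 (+) phi2] = E|X|
max_of_e = max(expect(phi1), expect(phi2))           # E[phi1] (+) E[phi2]
```

Here `e_of_max` equals 1 while `max_of_e` equals 0, so the expectation operator does not preserve max-plus sums.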


The max-plus distributivity principle we have developed is of the form

$$\int_W \sup_{z \in Z} h(w,z)\,P(dw) = \sup_{Q \in \mathcal{P}(W \times Z)} \int_{W \times Z} h(w,z)\,Q(dw\,dz),$$

in which $\mathcal{P}(W \times Z)$ denotes the set of probability measures on $W \times Z$ having marginal $P$ on $W$.
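For finite $W$ and $Z$ the identity can be verified by direct enumeration. The sketch below assumes a hypothetical three-point $W$, two-point $Z$, and an arbitrary table `h`; every $Q$ with marginal $P$ on $W$ is written as $Q(w,z) = P(w)K(z \mid w)$ for a conditional distribution $K$. Because the objective is linear in $Q$, it is enough to search the deterministic kernels (one choice of $z$ per $w$).

```python
from itertools import product

# Hypothetical finite data: P is a distribution on W = {0,1,2}; h[w][z] with Z = {0,1}.
P = [0.2, 0.5, 0.3]
h = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.5]]
NZ = 2

# Left side: integrate the pointwise supremum over z.
lhs = sum(p * max(row) for p, row in zip(P, h))


def integral(kernel):
    """Integral of h under Q(w, z) = P(w) * kernel[w][z]."""
    return sum(P[w] * kernel[w][z] * h[w][z]
               for w in range(len(P)) for z in range(NZ))


def selector_kernel(choice):
    """Deterministic kernel that puts all mass on z = choice[w]."""
    return [[1.0 if z == choice[w] else 0.0 for z in range(NZ)]
            for w in range(len(P))]


# Right side: supremum over couplings, attained at a deterministic kernel.
rhs = max(integral(selector_kernel(s)) for s in product(range(NZ), repeat=len(P)))

# A non-optimal coupling (uniform kernel) for comparison.
uniform = integral([[0.5, 0.5] for _ in P])
```

The maximizing kernel simply selects, for each $w$, the index $z$ that attains the inner supremum, which is why the two sides coincide.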


The relevance of this result arises in conjunction with max-plus finite element approximations to
the Bellman equations.


Max-Plus  Finite  Element  Approximations 


Max-plus finite elements provide useful approximation tools for dynamic programming. Within
the context of this work, we use the finite elements in conjunction with max-plus distributivity to
approximate solutions to the stochastic Bellman dynamic programming equation. We have
considered the linear elements

$$\psi_i(x) = -c_i\,|x - x_i|,$$

quadratic elements

$$\psi_i(x) = -c_i\,|x - x_i|^2,$$

and Legendre elements

$$\psi_i(x) = p_i \cdot x - c\,|x|^2.$$

In the linear and quadratic elements, $x_i$ are the element nodes, and $c_i$ are scale parameters. In the
Legendre formulation, the elements are indexed by the slopes $p_i$. A max-plus approximation of a
function $f$ takes the form

$$f(x) \approx \bigoplus_{k=1}^{N} a_k \otimes \psi_k(x) = \max_k \{ a_k + \psi_k(x) \},$$

in which the weights $a_k$ are defined by

$$a_k = -\max_x \{ \psi_k(x) - f(x) \}.$$

Note that a max-plus interpolation has an interesting and perhaps unintuitive structure. Figure 1
below illustrates the projection onto linear elements for an example function.


Figure  1.  Blue  curve  is  the  original  function;  red  curve  is  the  max-plus  finite  element  projection. 
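A projection of this kind can be reproduced with a few lines of Python. The sketch below is not the computation behind Figure 1; the test function $\sin(3x)$, the grid, the node spacing, and the scale parameter are assumptions chosen for illustration. It computes the weights $a_k = -\max_x\{\psi_k(x) - f(x)\}$ for linear elements and evaluates the max-plus expansion.

```python
import math

# Hypothetical setup: grid on [-1, 1], every 10th grid point is an element node.
GRID = [-1.0 + 0.02 * k for k in range(101)]
NODES = GRID[::10]
C = 4.0  # common scale parameter c_i (chosen above the Lipschitz constant of f)


def f(x):
    """Example function to be approximated (an assumption, not from the report)."""
    return math.sin(3.0 * x)


def project(func):
    """Weights a_k = -max_x {psi_k(x) - func(x)} with psi_k(x) = -C*|x - x_k|."""
    return [-max(-C * abs(x - xk) - func(x) for x in GRID) for xk in NODES]


def evaluate(weights, x):
    """Max-plus expansion: max_k {a_k + psi_k(x)}."""
    return max(a - C * abs(x - xk) for a, xk in zip(weights, NODES))


weights = project(f)
# Largest amount by which the expansion falls below f on the grid.
gap = max(f(x) - evaluate(weights, x) for x in GRID)
```

By construction the expansion never exceeds $f$ on the grid. With the scale parameter chosen above the Lipschitz constant of $f$, it also matches $f$ at the nodes, and between nodes it is a maximum of downward-pointing cones rather than a smooth interpolant.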


To apply the finite element method to dynamic programming, we first examine the deterministic
Bellman equation

$$V(y,t) = \max_{u \in U(t,s)} \left\{ \int_t^s g(x(\tau),u(\tau))\,d\tau + V(x(s;t,y,u),s) \right\},$$

which is max-plus linear. We substitute the finite element expansion

$$V_N(x,t) = \bigoplus_{k=1}^{N} a_k(t) \otimes \psi_k(x) = \max_k \{ a_k(t) + \psi_k(x) \}$$

into the Bellman equation, which leads to the max-plus matrix iteration

$$a(t) = B \otimes a(t+h),$$

in which the matrix $B$ is defined by

$$B_{ij} = -\max_x \{ \psi_i(x) - S_{t+h,t}(\psi_j)(x) \}.$$

Our contribution on the deterministic side has been the introduction of the Legendre elements,
which provide second order accuracy of approximation. Our focus, however, has been on the
stochastic control area.
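The matrix iteration can be sketched for a toy one-dimensional problem. Everything specific in the code below is an assumption made for illustration: the dynamics $\dot{x} = u$, the running cost $g(x,u) = -x^2 - u^2/2$, the three-point control set, the Euler time step, the quadratic elements, and the zero terminal weights. The maximum over $x$ in the definition of $B$ is taken over a finite grid.

```python
H = 0.1                                   # time step h (assumed)
CONTROLS = [-1.0, 0.0, 1.0]               # finite control set (assumed)
GRID = [-2.0 + 0.05 * k for k in range(81)]
NODES = GRID[::8]                         # element nodes x_j
C = 2.0                                   # scale parameter of the quadratic elements
N = len(NODES)


def psi(j, x):
    """Quadratic element psi_j(x) = -C * (x - x_j)^2."""
    return -C * (x - NODES[j]) ** 2


def step(phi, x):
    """One-step operator S(phi)(x) = max_u {H*g(x,u) + phi(x + H*u)} for the toy problem."""
    return max(H * (-x * x - 0.5 * u * u) + phi(x + H * u) for u in CONTROLS)


# B_ij = -max_x {psi_i(x) - S(psi_j)(x)}, with the max taken over the grid.
B = [[-max(psi(i, x) - step(lambda z: psi(j, z), x) for x in GRID)
      for j in range(N)] for i in range(N)]


def mp_matvec(B, a):
    """Max-plus product (B (x) a)_i = max_j (B_ij + a_j)."""
    return [max(b + aj for b, aj in zip(row, a)) for row in B]


a = [0.0] * N          # terminal weights (assumed)
for _ in range(10):    # ten backward steps: a(t) = B (x) a(t + h)
    a = mp_matvec(B, a)
```

One property worth noting in this sketch is that each coefficient of $B \otimes a$ is no larger than the coefficient obtained by propagating $V_N$ exactly and then projecting, since $\min_x \max_j \ge \max_j \min_x$; the matrix iteration therefore stays on the conservative side of the exact projection.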

Traditional  Stochastic  Control  Problems 

The standard stochastic control model is of the form

$$dX = f(X,u)\,dt + \sigma(X)\,dW, \qquad X(t_0) = X_0,$$

in which $W$ is a standard Brownian motion, and the standard running cost criterion is given by

$$J(u, x_0, t_0) = E\left\{ \int_{t_0}^{t_f} g(X(t), u(t))\,dt \right\},$$

which is to be maximized over admissible controls

$$u \in U(t_0,t_f) \subset \left\{ u : [t_0,t_f] \to \mathbb{R}^m \;\middle|\; \|u\| < \infty,\ u \text{ progressively measurable} \right\}.$$

The Bellman equation of dynamic programming (DPE) takes the form

$$V(y,t) = \max_{u \in U(t,s)} E\left\{ \int_t^s g(X(\tau),u(\tau))\,d\tau + V(X(s;t,y,u),s) \right\},$$

in which $V$ is the value function

$$V(x_0,t_0) = \max_{u \in U(t_0,t_f)} J(u,x_0,t_0).$$


The semigroup for backward propagation,

$$S_{s,t}(\phi)(y) = \max_{u \in U(t,s)} E\left\{ \int_t^s g(X(\tau),u(\tau))\,d\tau + \phi(X(s;t,y,u)) \right\},$$

is not (necessarily) max-plus linear, because the maximization and expectation cannot be
interchanged in order. However, max-plus distributivity allows us some flexibility.


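The finite element expansion substituted into the semigroup yields

$$S_{s,t}(V_N)(y) = \max_{u \in U(t,s)} E\left\{ \int_t^s g(X(\tau),u(\tau))\,d\tau + \bigoplus_{i=1}^{N} a_i(s) \otimes \psi_i(X(s;t,y,u)) \right\}$$

$$= \bigoplus_{u \in U(t,s)} E\left\{ \bigoplus_{i=1}^{N} \left[ \int_t^s g(X(\tau),u(\tau))\,d\tau \otimes a_i(s) \otimes \psi_i(X(s;t,y,u)) \right] \right\}.$$

Applying distributivity, we have

$$S_{s,t}(V_N)(y) = \bigoplus_{u \in U(t,s)} \; \bigoplus_{Q \in \mathcal{P}(\Omega \times \mathcal{N})} \int_{\Omega \times \mathcal{N}} \left[ \int_t^s g(X(\tau),u(\tau))\,d\tau \otimes a_z(s) \otimes \psi_z(X(s;t,y,u)) \right] Q(d\omega\,dz),$$

in which the distributivity property is applied with the index set $\mathcal{N} = \{1, 2, \ldots, N\}$ in the
role of $Z$: the maximization over the element index is replaced by random variables taking values
in $\mathcal{N}$, which permits the interchange of expectation and maximization order. This
propagation is then projected back onto the finite element basis through the same relation as in the
deterministic situation. It is interesting to note that the optimization over $Q$ is essentially a linear
programming problem.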


An alternative to applying distributivity involves the fast Legendre transform. Given a
discretized function on a grid in $\mathbb{R}$, $\{(x_j, y_j) : 0 \le j \le n\}$, and discrete slope parameters
$\{p_i : 0 \le i \le m\}$, we compute the approximate gradients

$$g_j = \frac{y_{j+1} - y_j}{x_{j+1} - x_j}, \qquad j = 0, \ldots, n-1,$$

and merge the two sequences $g$ and $p$. The interval $(p_i, p_{i+1})$ that contains $g_j$ provides the output
slope of the discrete Legendre transform. If the sequences are pre-sorted, this merge takes
$O(n + m)$ operations. Similar algorithms exist for higher dimensional problems.
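The merge idea can be sketched in Python for convex data in one dimension. This is a minimal illustration, not the implementation used in this work: the function name `discrete_legendre` and the sample data are hypothetical, and the code assumes the grid is strictly increasing, the data are convex (so the gradients $g_j$ are already sorted), and the slopes $p_i$ are sorted. It evaluates the discrete transform $\max_j \{p\,x_j - y_j\}$ with a single forward pass over both sequences.

```python
def discrete_legendre(xs, ys, ps):
    """Discrete Legendre transform max_j (p*xs[j] - ys[j]) for each p in ps.

    Assumes xs strictly increasing, ys convex on xs, and ps sorted ascending,
    so one pointer sweep over the gradient sequence suffices (O(n + m) work).
    """
    n = len(xs) - 1
    slopes = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(n)]
    out, j = [], 0
    for p in ps:
        # Moving from node j to j+1 improves p*x - y exactly when slopes[j] < p.
        while j < n and slopes[j] < p:
            j += 1
        out.append(p * xs[j] - ys[j])
    return out


# Hypothetical data: samples of x^2 and a sorted list of slopes.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]
ps = [-5.0, -1.0, 0.0, 0.5, 3.0, 6.0]

fast = discrete_legendre(xs, ys, ps)
# Brute-force reference, O(n * m), for comparison.
brute = [max(p * x - y for x, y in zip(xs, ys)) for p in ps]
```

The pointer `j` never moves backward, which is where the linear operation count comes from; non-convex data would first need the convexifying preconditioning discussed in the text.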


Thus the fully discrete Legendre finite element algorithm consists of the following steps.

1. Identify the preconditioning quadratic function that makes the value convex.

2. Initialize at the final time T with value V=0 (or a non-zero exit time cost).

3. Back-propagate the value function on a discrete grid of points to the next earlier time.

4. Re-project the value function onto the Legendre basis using the fast discrete Legendre
transform.

5. Return to step 3 and repeat until the initial time is reached.


Stochastic  Adaptive  Control  Problems 

Another  focus  of  this  research  project  has  been  in  the  analysis  of  adaptive  control  problems  that 
arise  in  pointing  and  tracking  for  high-energy  lasers.  Generally  speaking,  these  systems  comprise 
a  large,  complex  set  of  mechanical  and  electronic  components.  The  primary  goal  in  tracking  is  to 
maintain  the  target  of  interest  in  the  center  of  the  focal  plane  of  a  tracking  camera.  At  the  coarse 
physical  scale  of  operation,  this  goal  is  attacked  with  a  gimbal  that  rotates  the  telescope  of  the 
optical  pointing  and  tracking  system.  At  the  fine  scale,  actuation  is  achieved  with  a  fine  track 
mirror  that  is  controlled  at  a  much  higher  frequency.  In  the  material  below,  we  discuss  the  fine 
track  problem. 

The fine track system is typically characterized with a linear time-invariant model, instantiated as
a discrete-time rational transfer function. Of course, the actual plant is the hardware; the model
provides a means of devising control actions. A block diagram of the fine track loop is given in
Figure 2. The process involves an actual plant (that is, the hardware system
to be controlled), a model of the plant, the controller, and the adaptive processing that tunes the
gains of the controller to cancel the disturbance.




Figure  2.  Adaptive  control  block  diagram. 

The actual plant is unknown and instantiated in hardware. The model plant is a mathematical and
computational approximation of the actual plant that must be estimated experimentally. The
augmenting controller attempts to produce a control signal that will cancel out the disturbance, $w$,
and drive the output $y$ to 0, using adaptive tuning of the tap weights or gains in an FIR filter.

The model, in equation form, is given by

$$y = Gu + w \quad \text{(actual hardware system)},$$
$$\hat{w} = y - \hat{G}u \quad \text{(estimate of disturbance)},$$
$$u = z^{-1}F\hat{w} \quad \text{(adaptive controller)}.$$

The model of the plant provides a means of estimating the disturbance and determining the
appropriate control with which to cancel it out. The control gains embodied in the filter $F$ are
determined by minimizing

$$J(F) = E\,|\hat{y}|^2 = E\,\left| (I + z^{-1}\hat{G}F)\hat{w} \right|^2,$$

in which

$$\hat{y} = \hat{w} + \hat{G}u = (I + z^{-1}\hat{G}F)\hat{w}$$

yields an estimate of the closed-loop noise-to-track error transfer function. Note that if we were
to have estimated the transfer function perfectly ($\hat{G} = G$), then this relationship would provide
exactly the noise-to-output transfer function. We should also note that if $F = -zG^{-1}$ were
realizable, then this feedback operator would exactly cancel the transfer function. Since such an
$F$ is not causal (requiring at least one step of future data), the minimizing solution must depend on
the disturbance. We use recursive least squares techniques to estimate the filter $F$.
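To make the role of recursive least squares concrete, the following Python sketch fits a two-tap FIR filter for a toy case. It is an illustration under strong assumptions, not the fielded algorithm: the plant model is taken to be the identity, the disturbance estimate is a noise-free sinusoid, and the function name `rls_fit`, the initial covariance scale `delta`, and the signal frequency are hypothetical. Under these assumptions the residual is $\hat{w}_n + f_0\hat{w}_{n-1} + f_1\hat{w}_{n-2}$, so the taps are obtained by regressing $-\hat{w}_n$ on the two previous samples.

```python
import math


def rls_fit(regressors, targets, delta=1e4):
    """Recursive least squares (no forgetting) for targets ~ phi . theta."""
    dim = len(regressors[0])
    theta = [0.0] * dim
    # P approximates the inverse correlation matrix; start from delta * I.
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for phi, d in zip(regressors, targets):
        Pphi = [sum(P[i][j] * phi[j] for j in range(dim)) for i in range(dim)]
        denom = 1.0 + sum(phi[i] * Pphi[i] for i in range(dim))
        gain = [v / denom for v in Pphi]
        err = d - sum(phi[i] * theta[i] for i in range(dim))
        theta = [theta[i] + gain[i] * err for i in range(dim)]
        # P <- P - gain * (P phi)^T, using the symmetry of P.
        P = [[P[i][j] - gain[i] * Pphi[j] for j in range(dim)] for i in range(dim)]
    return theta


# Toy disturbance estimate: a pure sinusoid (assumed), with plant model equal to 1.
omega = 0.3
w = [math.cos(omega * n) for n in range(300)]

# One-step delay in the controller: only past samples are available at time n.
regressors = [[w[n - 1], w[n - 2]] for n in range(2, len(w))]
targets = [-w[n] for n in range(2, len(w))]
taps = rls_fit(regressors, targets)

# Residual track error over the last 50 samples with the fitted taps.
residual = max(abs(w[n] + taps[0] * w[n - 1] + taps[1] * w[n - 2])
               for n in range(250, 300))
```

For a sinusoid, $w_n = 2\cos(\omega)\,w_{n-1} - w_{n-2}$, so the fitted taps should approach $(-2\cos\omega,\ 1)$ and the residual should be close to zero. This illustrates why the minimizing filter depends on the disturbance: a causal filter can only cancel what it can predict from past data.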

With  George  Yin  and  Le  Yi  Wang  of  Wayne  State,  we  have  developed  conditions  under  which 
least  squares  estimators  converge,  even  under  model/plant  mismatch.  These  preliminary  results 
are  among  the  first  approaches  to  analyzing  the  stability  of  these  adaptive  loops,  a  key  issue  to 
engineers  building  high  energy  laser  systems. 

The basic problem can be written as

$$y_n = \phi_n^T \theta + \Delta(\phi_n, \theta) + e_n,$$

in which $\Delta$ denotes the model mismatch, which may depend on the “true” parameter $\theta$ as well as
the exogenous sequence $\phi$. Applying a traditional adaptive estimation algorithm of the form

$$\theta_{n+1} = \theta_n + a_n \phi_n \left( y_n - \phi_n^T \theta_n \right)$$

leads to a biased estimator whose convergence properties are uncertain. Using weak convergence
techniques, we have shown that sufficiently small plant perturbations lead to convergence of the
adaptive estimation technique.
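The effect of a small mismatch on this recursion can be explored with a short simulation. The sketch below is purely illustrative and makes several assumptions that are not from this report: a constant gain $a_n = a$, a deterministic rotating regressor, no measurement noise ($e_n = 0$), and a bounded mismatch $\Delta(\phi, \theta) = \varepsilon \tanh(\phi^T\theta)$. It compares the final estimation error with and without the mismatch.

```python
import math


def run_estimator(eps, steps=2000, gain=0.05):
    """Iterate theta_{n+1} = theta_n + a * phi_n * (y_n - phi_n^T theta_n).

    Returns the Euclidean distance between the final estimate and the true
    parameter. The mismatch eps * tanh(phi^T theta_true) is bounded by eps.
    """
    theta_true = [1.0, -0.5]   # hypothetical "true" parameter
    theta = [0.0, 0.0]
    for n in range(steps):
        phi = [math.cos(n), math.sin(n)]           # exogenous regressor (assumed)
        nominal = phi[0] * theta_true[0] + phi[1] * theta_true[1]
        y = nominal + eps * math.tanh(nominal)     # observation with model mismatch
        innovation = y - (phi[0] * theta[0] + phi[1] * theta[1])
        theta = [theta[k] + gain * phi[k] * innovation for k in range(2)]
    return math.hypot(theta[0] - theta_true[0], theta[1] - theta_true[1])


error_exact = run_estimator(eps=0.0)      # no mismatch
error_mismatch = run_estimator(eps=0.05)  # small mismatch
```

In this toy setting the estimate without mismatch converges to the true parameter, while the small mismatch leaves a residual bias whose size is on the order of $\varepsilon$. This is only a numerical illustration of the qualitative behavior described above, not a substitute for the weak convergence analysis.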

We should note that this technology has been involved in a number of transition efforts. In fact,
the actual application of the recursive estimation algorithms has preceded the theoretical
understanding  of  stability:  these  algorithms  have  been  applied  by  our  team  at  White  Sands 
Missile  Range  and  Kirtland  Air  Force  Base  in  laboratory  field  experiments,  prototype  strategic 
relay  systems,  and  prototype  tactical  laser  tracking  platforms.  Our  technical  point  of  contact  at 
AFRL  has  been  Dr.  Dan  Herrick,  and  we  have  worked  closely  with  Steve  Baugh  of  Boeing 
Directed  Energy  Systems  in  transition  and  testing  as  well. 